Self-Supervised Light Field Reconstruction Using Shearlet Transform and Cycle Consistency
The image-based rendering approach using the Shearlet Transform (ST) is one of
the state-of-the-art Densely-Sampled Light Field (DSLF) reconstruction methods.
It reconstructs Epipolar-Plane Images (EPIs) in the image domain via an iterative
regularization algorithm that restores their coefficients in the shearlet domain.
Consequently, the ST method tends to be slow because of the time spent on
domain transformations over dozens of iterations. To overcome this limitation,
this letter proposes a novel self-supervised DSLF reconstruction method,
CycleST, which applies ST and cycle consistency to DSLF reconstruction.
Specifically, CycleST is composed of an encoder-decoder network and a residual
learning strategy that restore the shearlet coefficients of densely-sampled
EPIs using EPI reconstruction and cycle-consistency losses. Moreover, CycleST is
a self-supervised approach that can be trained solely on Sparsely-Sampled Light
Fields (SSLFs) with small disparity ranges (≤ 8 pixels). Experimental
results of DSLF reconstruction on SSLFs with large disparity ranges (16 - 32
pixels) from two challenging real-world light field datasets demonstrate the
effectiveness and efficiency of the proposed CycleST method. Furthermore,
CycleST achieves at least a ~9x speedup over ST.
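To make the combined objective concrete, the following PyTorch-style sketch shows how an EPI reconstruction loss and a cycle-consistency loss on shearlet coefficients could be assembled. The operators `shearlet_analysis`, `shearlet_synthesis`, `downsample_views` and the network `restore_net` are placeholders, and the overall structure is an illustration of the idea rather than the authors' implementation.

```python
import torch.nn.functional as F

def cyclest_style_losses(restore_net, shearlet_analysis, shearlet_synthesis,
                         downsample_views, sparse_epi, lambda_cycle=1.0):
    """Illustrative sketch of a self-supervised loss combining EPI
    reconstruction and cycle consistency (not the authors' exact code)."""
    # Restore shearlet coefficients of a densely-sampled EPI from the
    # coefficients of the sparsely-sampled input EPI.
    coeffs_dense = restore_net(shearlet_analysis(sparse_epi))
    dense_epi = shearlet_synthesis(coeffs_dense)

    # Reconstruction loss: dropping views from the predicted dense EPI
    # should give back the observed sparse EPI (self-supervision).
    loss_rec = F.l1_loss(downsample_views(dense_epi), sparse_epi)

    # Cycle consistency: feeding the re-sparsified prediction through the
    # pipeline again should reproduce the same dense coefficients.
    coeffs_cycle = restore_net(shearlet_analysis(downsample_views(dense_epi)))
    loss_cycle = F.l1_loss(coeffs_cycle, coeffs_dense)

    return loss_rec + lambda_cycle * loss_cycle
```

In this sketch both terms are computed from the input EPI alone, which is what would make such a training scheme self-supervised.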
Learning Wavefront Coding for Extended Depth of Field Imaging
Depth of field is an important property of imaging systems that strongly affects
the quality of the acquired spatial information. Extended depth of field (EDoF)
imaging is a challenging ill-posed problem and has been extensively addressed
in the literature. We propose a computational imaging approach for EDoF, where
we employ wavefront coding via a diffractive optical element (DOE) and we
achieve deblurring through a convolutional neural network. Thanks to the
end-to-end differentiable modeling of optical image formation and computational
post-processing, we jointly optimize the optical design, i.e., DOE, and the
deblurring through standard gradient descent methods. Based on the properties
of the underlying refractive lens and the desired EDoF range, we provide an
analytical expression for the search space of the DOE, which is instrumental in
the convergence of the end-to-end network. We achieve superior EDoF imaging
performance compared to the state of the art, where we demonstrate results with
minimal artifacts in various scenarios, including deep 3D scenes and broadband
imaging.
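Since both the optical image formation model and the post-processing network are differentiable, the optical element and the network weights can be updated by the same optimizer. The toy PyTorch sketch below illustrates that joint optimization; the learnable blur kernel standing in for the DOE-induced PSF, the tiny CNN and the loss are simplifying assumptions, not the paper's optical model or architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEDoFModel(nn.Module):
    """Joint optics + post-processing sketch: a learnable 'PSF' stands in
    for the DOE-induced blur, followed by a small deblurring CNN."""
    def __init__(self, psf_size=15):
        super().__init__()
        # Learnable surrogate for the optical element (assumption: the real
        # model parameterises a DOE profile and derives the PSF from it).
        self.psf = nn.Parameter(torch.rand(1, 1, psf_size, psf_size))
        self.deblur = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, sharp):
        # Keep the kernel non-negative and normalised so it acts like a PSF.
        psf = torch.softmax(self.psf.flatten(), 0).view_as(self.psf)
        blurred = F.conv2d(sharp, psf, padding=self.psf.shape[-1] // 2)
        return self.deblur(blurred)

model = ToyEDoFModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # optics + CNN jointly
sharp = torch.rand(4, 1, 64, 64)           # stand-in training batch
loss = F.mse_loss(model(sharp), sharp)     # end-to-end reconstruction loss
loss.backward()
optimizer.step()
```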
Fast and Accurate Depth Estimation from Sparse Light Fields
We present a fast and accurate method for dense depth reconstruction from
sparsely sampled light fields obtained using a synchronized camera array. In
our method, the source images are over-segmented into non-overlapping compact
superpixels that are used as basic data units for depth estimation and
refinement. Superpixel representation provides a desirable reduction in the
computational cost while preserving the image geometry with respect to the
object contours. Each superpixel is modeled as a plane in the image space,
allowing depth values to vary smoothly within the superpixel area. Initial
depth maps, which are obtained by plane sweeping, are iteratively refined by
propagating good correspondences within an image. To ensure the fast
convergence of the iterative optimization process, we employ a highly parallel
propagation scheme that operates on all the superpixels of all the images at
once, making full use of the parallel graphics hardware. A few iterations of
optimizing an energy function that incorporates superpixel-wise smoothness and
geometric consistency constraints suffice to recover depth with high accuracy in
textured and textureless regions, as well as in areas with occlusions, producing
dense, globally consistent depth maps. We demonstrate that while the depth
reconstruction takes about a second per full high-definition view, the accuracy
of the obtained depth maps is comparable with the state-of-the-art results.
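As an illustration of the initialization step, the sketch below computes a per-superpixel plane-sweep cost for a rectified horizontal camera pair and picks the lowest-cost depth hypothesis for each superpixel. The disparity model, the SAD-style cost and the function names are assumptions made for the example; the method additionally refines these initial estimates by propagating good hypotheses between superpixels across all views in parallel.

```python
import numpy as np

def plane_sweep_superpixel_depths(ref, neighbor, labels, depths, baseline_px):
    """Illustrative plane-sweep initialisation per superpixel (assumption:
    rectified horizontal camera pair, disparity = baseline_px / depth)."""
    n_sp = labels.max() + 1
    cost = np.full((n_sp, len(depths)), np.inf)

    for d_idx, depth in enumerate(depths):
        disparity = int(round(baseline_px / depth))
        # Shift the neighbour view by the hypothesised disparity.
        warped = np.roll(neighbor, disparity, axis=1)
        diff = np.abs(ref - warped)
        # Aggregate the photo-consistency cost over each superpixel.
        for sp in range(n_sp):
            cost[sp, d_idx] = diff[labels == sp].mean()

    # Each superpixel takes the depth hypothesis with the lowest cost.
    return np.array(depths)[cost.argmin(axis=1)]
```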
Optical modelling of accommodative light field display system and prediction of human eye responses
The spatio-angular resolution of a light field (LF) display is a crucial
factor for delivering adequate spatial image quality and eliciting an
accommodation response. Previous studies have modelled retinal image formation
with an LF display and evaluated whether accommodation would be evoked
correctly. The models were mostly based on ray tracing and a schematic eye
model, which incur high computational complexity and do not accurately represent
the behaviour of the human eye population. We propose an efficient wave-optics-based framework
to model the human eye and a general LF display. With the model, we simulated
the retinal point spread function (PSF) of a point rendered by an LF display at
various depths to characterise the retinal image quality. Additionally,
accommodation responses to rendered LF images were estimated by computing the
visual Strehl ratio based on the optical transfer function (VSOTF) from the
PSFs. We assumed an ideal LF display that had an infinite spatial resolution
and was free from optical aberrations in the simulation. We tested images
rendered at depths of 0–4 dioptres with angular resolutions of up to 4x4
viewpoints within the pupil. The simulation predicted small and constant
accommodation errors, which contradict the findings of previous studies. An
evaluation of the optical resolution of the rendered retinal image suggested a
trade-off between the maximum resolution achievable and the depth range of a
rendered image where in-focus resolution is kept high. The proposed framework
can be used to evaluate the upper bound of the optical performance of an LF
display for realistically aberrated eyes, which may help to find an optimal
spatio-angular resolution required to render a high-quality 3D scene.
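For reference, the visual Strehl ratio based on the OTF compares the CSF-weighted volume of the eye's optical transfer function against that of a diffraction-limited system. A minimal NumPy sketch of that computation from simulated PSFs is given below; the sampling grid, normalization and CSF input are assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np

def vsotf(psf, psf_diffraction_limited, csf):
    """Hedged sketch of a visual Strehl ratio based on the OTF (VSOTF):
    CSF-weighted OTF volume relative to the diffraction-limited case.
    All inputs are 2-D arrays on the same frequency grid (assumption)."""
    # The OTF is the Fourier transform of the PSF, normalised at zero frequency.
    otf = np.fft.fft2(psf)
    otf_dl = np.fft.fft2(psf_diffraction_limited)
    otf = np.fft.fftshift(otf / otf[0, 0])
    otf_dl = np.fft.fftshift(otf_dl / otf_dl[0, 0])

    # Weight the real part of each OTF by the contrast sensitivity function
    # and compare the integrated volumes.
    return np.sum(csf * np.real(otf)) / np.sum(csf * np.real(otf_dl))
```

In a setup like this, the predicted accommodation response would be taken as the focus state that maximizes the metric over the simulated PSFs.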
3D-DCT based perceptual quality assessment of stereo video
In this paper, we present a novel stereoscopic video quality assessment method based on the 3D-DCT transform. In our approach, similar blocks from the left and right views of stereoscopic video frames are found by block matching, grouped into a 3D stack and then analyzed by 3D-DCT. Comparisons between the reference and distorted images are made in terms of the MSE calculated within the 3D-DCT domain, modified to reflect the contrast sensitivity function and luminance masking. We validate our quality assessment method using test videos annotated with results from subjective tests. The results show that the proposed algorithm outperforms current popular metrics over a wide range of distortion levels.
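A minimal sketch of the transform-domain comparison is shown below: matched left/right blocks are stacked, transformed with a 3D-DCT, and compared with a frequency-weighted MSE. The block size, the uniform example weights and the omission of the luminance-masking term are simplifications; this is not the paper's exact metric.

```python
import numpy as np
from scipy.fft import dctn

def block_3d_dct_distortion(ref_stack, dist_stack, csf_weights):
    """Hedged sketch: weighted MSE between 3D-DCT coefficients of matched
    block stacks from the reference and distorted stereo frames."""
    # 3D-DCT of the grouped blocks (e.g. an 8x8x2 stack of matched
    # left/right blocks); 'ortho' keeps the transform energy-preserving.
    ref_dct = dctn(ref_stack, norm="ortho")
    dist_dct = dctn(dist_stack, norm="ortho")

    # Frequency-weighted MSE in the transform domain.
    return (csf_weights * (ref_dct - dist_dct) ** 2).mean()

# Toy usage with random 8x8x2 stacks and uniform weights (illustration only).
ref = np.random.rand(8, 8, 2)
dist = ref + 0.01 * np.random.randn(8, 8, 2)
print(block_3d_dct_distortion(ref, dist, np.ones((8, 8, 2))))
```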
Densely-sampled light field reconstruction
In this chapter, we motivate the use of densely-sampled light fields as a representation that provides the density of light rays required for the correct recreation of 3D visual cues, such as focus and continuous parallax, and that can serve as an intermediary between light field sensing and light field display. We consider the problem of reconstructing such a representation from a few camera views and approach it in a sparsification framework. More specifically, we demonstrate that the light field is well structured in the set of so-called epipolar images and can be sparsely represented by a dictionary of directional and multi-scale atoms called shearlets. We present the corresponding regularization method, along with its main algorithm and speed-accelerating modifications. Finally, we illustrate its applicability for the cases of holographic stereograms and light field compression.
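At its core, the regularization referred to here is an iterative thresholding scheme in the shearlet domain: the known views are re-imposed in the image domain, the EPI is analyzed with shearlets, small coefficients are discarded, and the EPI is synthesized again while the threshold decreases. The sketch below captures that generic structure with placeholder analysis/synthesis operators and a linear threshold schedule; it is an illustration of the idea, not the chapter's exact algorithm.

```python
import numpy as np

def iterative_shearlet_regularisation(measured_epi, mask, analysis, synthesis,
                                      n_iters=50, thr_start=0.5, thr_end=0.01):
    """Sketch of transform-domain iterative thresholding for EPI reconstruction.
    `mask` is 1 on rows (views) that were actually captured, 0 elsewhere;
    `analysis`/`synthesis` are placeholder shearlet operators (assumptions)."""
    epi = measured_epi.copy()
    thresholds = np.linspace(thr_start, thr_end, n_iters)  # decreasing schedule
    for thr in thresholds:
        # Re-impose the known samples in the image domain.
        epi = mask * measured_epi + (1 - mask) * epi
        # Sparsify in the shearlet domain by hard-thresholding coefficients.
        coeffs = analysis(epi)
        coeffs = np.where(np.abs(coeffs) >= thr, coeffs, 0.0)
        epi = synthesis(coeffs)
    # Keep the measured rows exactly in the final estimate.
    return mask * measured_epi + (1 - mask) * epi
```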